Mandarin-English Information (MEI)

Authors

  • Helen Meng
  • Sanjeev Khudanpur
  • Douglas W. Oard
  • Hsin-Min Wang
Abstract

Mandarin-English Information (MEI) is one of the four projects selected for the Johns Hopkins University Summer Workshop 2000. We plan to develop technologies for using written queries to search spoken documents (cross-media) between English and Mandarin Chinese (cross-language). Our research focus is on the integration of speech recognition and machine translation technologies in the context of translingual speech retrieval. We plan to work on the problems of: (i) indexing Mandarin Chinese audio with word and subword units, (ii) translating variable-size units for cross-language information retrieval, and (iii) devising effective retrieval strategies for English text queries and Mandarin Chinese news audio.

Similar resources

Multi-scale-audio indexing for translingual spoken document retrieval

MEI (Mandarin-English Information) is an English-Chinese cross-lingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advocates a multi-scale paradigm, where both Chinese words and subwords (characters and syllables) ar...

Multi-scale retrieval in MEI: an English-Chinese translingual speech retrieval system

This paper presents a multi-scale retrieval approach in MEI (Mandarin-English Information), an English-Chinese cross-lingual spoken document retrieval (CL-SDR) system. It accepts an entire English news story (from newspaper text) as the input query, and automatically retrieves "relevant" Mandarin news stories (from broadcast audio). This allows the user to search for personally relevant content...

A Cross-Linguistic Study of Voice Onset Time in Stop Consonant Productions

This study examines voice onset time (VOT) for phonetically voiceless word-initial stops in Mandarin Chinese and in English, as spoken by 11 Mandarin speakers and 4 British English speakers. The purpose of this paper is to compare Mandarin and English VOT patterns and to categorize their stop realizations along the VOT continuum. As expected, the findings reveal that voiceless aspirated stops i...

Children’s Knowledge of Disjunction and Universal Quantification in Mandarin Chinese

Downward entailing linguistic environments license inferences from sets to their subsets. These environments also determine the interpretation of disjunction: Disjunction licenses a conjunctive entailment in the scope of downward entailing operators (Crain 2008, 2012). This leads to a striking asymmetry across languages in the interpretation of disjunction when it appears in the restrictor (dow...

Improving Language Models for Mandarin Conversational Speech Recognition with Web Data

Lack of data is a problem in training language models for conversational speech recognition, particularly for languages other than English. Experiments in English have successfully used web-based text collection targeted for a conversational style to augment small sets of transcribed speech; here we look at extending these techniques to Mandarin. In addition, we investigate different techniques ...

Publication date: 2000